Michael Luu, MPH
Biostatistics & Bioinformatics Research Center
Cedars Sinai Medical Center
August 13, 2022
| x | y |
|---|---|
| 55.3846 | 97.1795 |
| 51.5385 | 96.0256 |
| 46.1538 | 94.4872 |
| 42.8205 | 91.4103 |
| 40.7692 | 88.3333 |
| 38.7179 | 84.8718 |
| 35.6410 | 79.8718 |
| 33.0769 | 77.5641 |
| 28.9744 | 74.4872 |
| 26.1538 | 71.4103 |
| x | y |
|---|---|
| 58.21361 | 91.88189 |
| 58.19605 | 92.21499 |
| 58.71823 | 90.31053 |
| 57.27837 | 89.90761 |
| 58.08202 | 92.00815 |
| 57.48945 | 88.08529 |
| 28.08874 | 63.51079 |
| 28.08547 | 63.59020 |
| 28.08727 | 63.12328 |
| 27.57803 | 62.82104 |
| x | y |
|---|---|
| 38.33776 | 92.47272 |
| 35.75187 | 94.11677 |
| 32.76722 | 88.51829 |
| 33.72961 | 88.62227 |
| 37.23825 | 83.72493 |
| 36.02720 | 82.04078 |
| 39.23928 | 79.26372 |
| 39.78452 | 82.26057 |
| 35.16603 | 84.15649 |
| 40.62212 | 78.54210 |
| x | y |
|---|---|
| 55.99303 | 79.27726 |
| 50.03225 | 79.01307 |
| 51.28846 | 82.43594 |
| 51.17054 | 79.16529 |
| 44.37791 | 78.16463 |
| 45.01027 | 77.88086 |
| 48.55982 | 78.78837 |
| 42.14227 | 76.88063 |
| 41.02697 | 76.40959 |
| 34.57531 | 72.72484 |
| dataset | n | mean_x | sd_x | mean_y | sd_y |
|---|---|---|---|---|---|
| A | 142 | 54.26 | 16.76 | 47.84 | 26.94 |
| B | 142 | 54.26 | 16.76 | 47.84 | 26.94 |
| C | 142 | 54.26 | 16.76 | 47.84 | 26.94 |
| D | 142 | 54.26 | 16.76 | 47.84 | 26.94 |
Rounded to two decimal places, the count (n), the means of x and y, and the standard deviations of x and y are identical for ALL four datasets!
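A summary table like the one above can be produced by grouping the points by dataset and computing the count, mean, and sample standard deviation of each coordinate. The sketch below is only an illustration in Python using the standard library; the `summarize` helper and the two tiny `toy` datasets are hypothetical and are not the data or code behind these slides.

```python
from statistics import mean, stdev


def summarize(datasets):
    """Return one summary row per dataset: n, plus mean and sample SD of x and y."""
    rows = []
    for name, points in datasets.items():
        xs = [x for x, _ in points]
        ys = [y for _, y in points]
        rows.append({
            "dataset": name,
            "n": len(points),
            "mean_x": round(mean(xs), 2),
            "sd_x": round(stdev(xs), 2),
            "mean_y": round(mean(ys), 2),
            "sd_y": round(stdev(ys), 2),
        })
    return rows


# Hypothetical toy data (NOT the datasets shown in the tables above):
# the same x and y values, paired differently, so the plots differ
# while the per-axis summary statistics match.
toy = {
    "line": [(1, 1), (2, 2), (3, 3)],
    "vee": [(1, 3), (2, 1), (3, 2)],
}

rows = summarize(toy)
for row in rows:
    print(row)
```

Both toy datasets produce the same summary row even though one is a straight line and the other is a "V" shape, which is the same effect the table above shows on a larger scale.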
The original “Datasaurus” or “dino” was created by Alberto Cairo in the following blog post
It was later made famous by a paper from Justin Matejka and George Fitzmaurice, ‘Same Stats, Different Graphs: Generating Datasets with Varied Appearance and Identical Statistics through Simulated Annealing’, in which they generated 12 additional datasets alongside the original “Datasaurus”, all with nearly identical summary statistics.